Parallel Sparse Modified Gram-Schmidt QR Decomposition
Authors
Abstract
We present a parallel algorithm for the QR decomposition with column pivoting of a sparse matrix by means of Modified Gram-Schmidt orthogonalization. The nonzero elements of the matrix M to be decomposed are stored in a one-dimensional doubly linked list. A strategy for reducing fill-in is discussed, which saves memory and decreases computation time. As an application of the QR decomposition, we describe the least squares problem. The algorithm is designed for a message-passing multiprocessor, and we evaluate it on the Cray T3D supercomputer using the Harwell-Boeing sparse matrix collection.
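As a rough illustration of the kind of computation the abstract describes, the following is a minimal serial sketch of Modified Gram-Schmidt QR with column pivoting on sparse columns. It is not the paper's implementation: the function name `mgs_qr_pivot`, the dict-per-column storage (used here in place of the doubly linked list), the largest-remaining-norm pivot rule, and the `tol` threshold for dropping tiny entries are all assumptions made for the example, and no parallel distribution is attempted.

```python
import math

def mgs_qr_pivot(cols, tol=1e-12):
    """Serial MGS QR with column pivoting on sparse columns (illustrative only).

    cols: list of sparse columns, each a dict {row_index: value}.
    Returns (Q, R, perm) with
      Q    : list of orthonormal sparse columns,
      R    : list of sparse columns of the upper triangular factor, R[j][i] = r_ij,
      perm : column order, so that column perm[j] of the input equals (Q R)[:, j].
    """
    work = [dict(c) for c in cols]          # working copy, overwritten in place
    n = len(work)
    rcols = [dict() for _ in range(n)]      # R entries travel with their column
    perm = list(range(n))
    Q = []
    for k in range(n):
        # Pivot (assumed rule): pick the remaining column with the largest norm.
        norms = [sum(v * v for v in work[j].values()) for j in range(k, n)]
        p = k + max(range(n - k), key=norms.__getitem__)
        if norms[p - k] <= tol * tol:
            break                            # remaining columns are numerically zero
        work[k], work[p] = work[p], work[k]
        rcols[k], rcols[p] = rcols[p], rcols[k]
        perm[k], perm[p] = perm[p], perm[k]

        # Normalize the pivot column to obtain q_k.
        rkk = math.sqrt(norms[p - k])
        q = {i: v / rkk for i, v in work[k].items()}
        rcols[k][k] = rkk
        Q.append(q)

        # Modified Gram-Schmidt step: immediately remove the q_k component
        # from every remaining column.
        for j in range(k + 1, n):
            col = work[j]
            r = sum(v * col.get(i, 0.0) for i, v in q.items())
            if r == 0.0:
                continue                     # structurally orthogonal: no fill-in
            rcols[j][k] = r
            for i, v in q.items():
                new = col.get(i, 0.0) - r * v
                if abs(new) > tol:
                    col[i] = new             # may create a new nonzero (fill-in)
                else:
                    col.pop(i, None)         # drop tiny entries to limit fill-in
    return Q, rcols, perm
```

Columns whose dot product with the current q_k is exactly zero are skipped, so they gain no new nonzeros; the fill-in reduction strategy discussed in the paper may differ from this simple thresholding.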
Similar resources
Performance Analysis of Modified Gram-Schmidt Cholesky Implementation on 16 bits-DSP-chip
This paper focuses on the performance analysis of linear system solving based on Cholesky decomposition and QR factorization, implemented on a 16-bit fixed-point DSP chip (TMS320C6474). The classical method of Cholesky decomposition has the advantage of low execution time. However, the modified Gram-Schmidt QR factorization performs better in terms of robustness against the round-off error propa...
A Generalized Gram-Schmidt Procedure for Parallel Applications
The Gram-Schmidt procedure is used to orthogonalize one vector against a set of vectors or to construct a QR factorization of a matrix. In the Classical Gram-Schmidt algorithm (CGS), orthogonal vectors are produced via matrix-vector updates, which is desirable for parallel computers. Unfortunately, this algorithm exhibits poor numerical stability and the loss of orthogonality cannot ...
Matrix Decomposition
5 QR Decomposition
5.1 Householder Reflections and Givens Rotations
5.2 Gram-Schmidt orthonormalization
5.3 QR Decomposition
5.4 Least Square Fitting
...
Implementation of LU Decomposition and QR Decomposition on Parallel Processing Systems
One of the earliest attempts to implement LU Decomposition with special-purpose hardware used systolic/wavefront arrays [2]. Different proposals for the processing elements (PEs) of systolic/wavefront arrays have been made [3][4][5]. These ideas were not implemented in circuits at that time, and the performance of these architectures was not quantitatively evaluated either. In 1994, E. Casseau [6] i...
Orthogonalization on a general purpose graphics processing unit with double double and quad double arithmetic
Our problem is to accurately solve linear systems on a general purpose graphics processing unit with double double and quad double arithmetic. The linear systems originate from the application of Newton’s method on polynomial systems. Newton’s method is applied as a corrector in a path following method, so the linear systems are solved in sequence and not simultaneously. One solution path may r...